This notebook can be run either locally or in Google Colab, but each has a limitation: on Windows machines the prediction section fails due to a coding error, and in Colab the visualization of the full station dataset is forcibly terminated because of the amount of data involved.
Goal: visualize all current charging points at their exact locations, and predict the number of charging points in each city in 2024 based on the IEA's historical records (global). The forecast relies only on the available data and is subject to error.
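As a rough illustration of how such a forecast might work, the sketch below fits a linear trend to hypothetical yearly totals (the numbers stand in for the IEA historical records, which are not shown here) and extrapolates to 2024:

```python
import numpy as np

# Hypothetical yearly charging-point totals standing in for the
# IEA historical records used by the notebook.
years = np.array([2019, 2020, 2021, 2022, 2023])
points = np.array([120.0, 150.0, 200.0, 270.0, 350.0])

# Fit a degree-1 polynomial (straight line) and evaluate it at 2024.
slope, intercept = np.polyfit(years, points, 1)
forecast_2024 = slope * 2024 + intercept
print(f"Forecast for 2024: {forecast_2024:.0f} charging points")
```

A linear trend is only one choice; growth in charging infrastructure is often closer to exponential, which is one reason the forecast is subject to error.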
import pandas as pd
import folium
from folium.plugins import MarkerCluster
# Load the dataset directly from GitHub
station_data_url = "https://raw.githubusercontent.com/Chameleon-company/EVCFLO/a7f84a3b80fa308c111981ba860584ef4d564c3b/datasets/T2_2023/Station_Compiler_Location_T2_2023/Station_Compiler_Location_T2_2023.csv"
station_data = pd.read_csv(station_data_url)
Check for duplicates, outliers, and nulls, then remove the invalid rows.
# Check for duplicates based on Latitude and Longitude
duplicates = station_data[station_data.duplicated(subset=['Latitude', 'Longitude'], keep=False)]
print(f"There are {len(duplicates)} potential duplicate charging points based on geographic location.")
There are 2711 potential duplicate charging points based on geographic location.
# Check whether any column contains NaN values
has_nan = station_data.isna().any().any()
print(has_nan)
True
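Knowing that NaNs exist somewhere is less useful than knowing which columns contain them. A small sketch (using a hypothetical sample frame with the same columns; in the notebook the same call would be made on `station_data`):

```python
import pandas as pd
import numpy as np

# Hypothetical sample mirroring the dataset's columns.
sample = pd.DataFrame({
    'Service_Station_Location': ['A', 'B', None],
    'Latitude': [51.5, np.nan, 48.8],
    'Longitude': [-0.1, 2.3, np.nan],
})

# Per-column NaN counts show *where* the missing values are.
nan_counts = sample.isna().sum()
print(nan_counts)
```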
# Latitude must lie within [-90, 90]; anything outside is an outlier
invalid_latitudes = station_data[(station_data['Latitude'] < -90) | (station_data['Latitude'] > 90)]
print(invalid_latitudes)
                            Service_Station_Location  Latitude  Longitude
1696  Carrer del Pare Josep Manxarrell, 11 (Eivissa)   398.918     1.4349
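The same sanity check applies to longitudes, which must lie within [-180, 180]. A sketch on a hypothetical sample (in the notebook the filter would run on `station_data`):

```python
import pandas as pd

# Hypothetical sample with one out-of-range longitude.
sample = pd.DataFrame({
    'Service_Station_Location': ['ok', 'bad'],
    'Latitude': [40.0, 41.0],
    'Longitude': [3.5, 200.0],
})

# Longitude must lie within [-180, 180]
invalid_longitudes = sample[(sample['Longitude'] < -180) | (sample['Longitude'] > 180)]
print(invalid_longitudes)
```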
# Drop rows with NaN values in Latitude and Longitude columns
station_data_cleaned = station_data.dropna(subset=['Latitude', 'Longitude'])
# Drop exact duplicates based on Latitude, Longitude, and Service_Station_Location
station_data_cleaned = station_data_cleaned.drop_duplicates(subset=['Latitude', 'Longitude', 'Service_Station_Location'], keep='first')
# Keep only rows with valid latitudes (within [-90, 90])
station_data_cleaned = station_data_cleaned[(station_data_cleaned['Latitude'] >= -90) & (station_data_cleaned['Latitude'] <= 90)]
print(station_data_cleaned[['Latitude', 'Longitude']].isnull().sum())
Latitude     0
Longitude    0
dtype: int64
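It is also worth reporting how many rows each cleaning step removed in total. The sketch below reproduces the three-step pipeline on a hypothetical four-row frame (one NaN row, one duplicate, one invalid latitude); in the notebook the counts would come from `station_data` and `station_data_cleaned`:

```python
import pandas as pd
import numpy as np

# Hypothetical mini-dataset: a duplicate pair, a NaN row, an outlier.
raw = pd.DataFrame({
    'Service_Station_Location': ['A', 'A', 'B', 'C'],
    'Latitude': [10.0, 10.0, np.nan, 398.9],
    'Longitude': [20.0, 20.0, 30.0, 1.4],
})

cleaned = raw.dropna(subset=['Latitude', 'Longitude'])
cleaned = cleaned.drop_duplicates(subset=['Latitude', 'Longitude', 'Service_Station_Location'])
cleaned = cleaned[cleaned['Latitude'].between(-90, 90)]
print(f"Removed {len(raw) - len(cleaned)} of {len(raw)} rows during cleaning.")
```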
!!! Warning: please run this visualization locally. Because of the large amount of data, Colab will forcibly terminate the session.
m = folium.Map(location=[station_data_cleaned['Latitude'].mean(), station_data_cleaned['Longitude'].mean()], zoom_start=10)
marker_cluster = MarkerCluster().add_to(m)
# Add one marker per station; MarkerCluster groups nearby points
for index, row in station_data_cleaned.iterrows():
    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        tooltip=row['Service_Station_Location']
    ).add_to(marker_cluster)
m